
    Self-Supervised Shape and Appearance Modeling via Neural Differentiable Graphics

    Inferring 3D shape and appearance from natural images is a fundamental challenge in computer vision. Despite recent progress using deep learning methods, a key limitation is the availability of annotated training data, as acquisition is often challenging and expensive, especially at large scale. This thesis proposes to incorporate physical priors into neural networks that allow for self-supervised learning. As a result, easy-to-access unlabeled data can be used for model training. In particular, novel algorithms are introduced in the context of 3D reconstruction and texture/material synthesis, where only image data is available as a supervisory signal. First, a method is proposed that learns to reason about 3D shape and appearance solely from unstructured 2D images, achieved via differentiable rendering in an adversarial fashion. As shown next, learning from videos significantly improves 3D reconstruction quality; to this end, a novel ray-conditioned warp embedding is proposed that aggregates pixel-wise features from multiple source images. Addressing the challenging task of disentangling shape and appearance, a method is first presented that enables 3D texture synthesis independent of shape and resolution. For this purpose, 3D noise fields of different scales are transformed into stationary textures; the method is able to produce 3D textures despite requiring only 2D textures for training. Lastly, the surface characteristics of textures under different illumination conditions are modeled in the form of material parameters. To this end, a self-supervised approach is proposed that has access not to material parameters but only to flash images. Similar to the previous method, random noise fields are reshaped into material parameters, which are conditioned to replicate the visual appearance of the input under matching light.
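    The thesis overview above hinges on differentiable rendering as the bridge between unlabeled 2D images and 3D predictions. Below is a minimal, hypothetical sketch of that training pattern, not the thesis architecture: a generator produces scene parameters, a differentiable stand-in renderer turns them into images, and a discriminator supplies the adversarial signal from unlabeled photos.

```python
# Minimal sketch of self-supervised learning via adversarial differentiable
# rendering. Every module here is an illustrative stand-in assumption.
import torch
import torch.nn as nn

latent_dim, image_pixels = 64, 32 * 32

generator = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                          nn.Linear(128, image_pixels))       # -> "3D params"
renderer = lambda params: torch.tanh(params)                  # differentiable stand-in
discriminator = nn.Sequential(nn.Linear(image_pixels, 128), nn.ReLU(),
                              nn.Linear(128, 1))

opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(100):
    real = torch.rand(16, image_pixels)          # unlabeled natural images
    fake = renderer(generator(torch.randn(16, latent_dim)))

    # Discriminator: distinguish real photos from renderings.
    loss_d = bce(discriminator(real), torch.ones(16, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(16, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator: fool the discriminator. Gradients flow through the renderer,
    # so images are the only supervisory signal.
    loss_g = bce(discriminator(fake), torch.ones(16, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```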

    Single-image Tomography: 3D Volumes from 2D Cranial X-Rays

    As many different 3D volumes could produce the same 2D x-ray image, inverting this process is challenging. We show that recent deep convolutional neural networks can solve this task. As the main challenge in learning is the sheer amount of data created when extending a 2D image into a 3D volume, we suggest first learning a coarse, fixed-resolution volume, which is then fused in a second step with the input x-ray into a high-resolution volume. To train and validate our approach, we introduce a new dataset that comprises close to half a million computer-simulated 2D x-ray images of 3D volumes scanned from 175 mammalian species. Applications of our approach include stereoscopic rendering of legacy x-ray images and re-rendering of x-rays with changes of illumination, view pose, or geometry. Our evaluation includes a comparison to previous tomography work, previous learning methods applied to our data, a user study, and an application to a set of real x-rays.
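    The coarse-then-fuse decomposition described above can be pictured as two networks in sequence. The following sketch uses assumed shapes and layers purely for illustration; it is not the paper's architecture.

```python
# Illustrative two-stage sketch: coarse fixed-resolution volume, then fusion
# with the input x-ray into a higher-resolution volume.
import torch
import torch.nn as nn

class CoarseVolumeNet(nn.Module):
    """Stage 1: map a 2D x-ray to a coarse, fixed-resolution 3D volume."""
    def __init__(self, res=16):
        super().__init__()
        self.res = res
        self.net = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(8), nn.Flatten(),
            nn.Linear(8 * 8 * 8, res ** 3),
        )
    def forward(self, xray):                      # (B, 1, H, W)
        return self.net(xray).view(-1, 1, self.res, self.res, self.res)

class FusionNet(nn.Module):
    """Stage 2: upsample the coarse volume and fuse it with the input x-ray."""
    def __init__(self, res=64):
        super().__init__()
        self.res = res
        self.refine = nn.Conv3d(2, 1, 3, padding=1)
    def forward(self, coarse, xray):
        up = nn.functional.interpolate(coarse, size=(self.res,) * 3,
                                       mode='trilinear', align_corners=False)
        # Broadcast the x-ray along the depth axis so every slice sees it.
        xr = nn.functional.interpolate(xray, size=(self.res, self.res))
        xr = xr.unsqueeze(2).expand(-1, -1, self.res, -1, -1)
        return self.refine(torch.cat([up, xr], dim=1))

xray = torch.rand(1, 1, 128, 128)
volume = FusionNet()(CoarseVolumeNet()(xray), xray)   # (1, 1, 64, 64, 64)
```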

    Learning a Neural 3D Texture Space from 2D Exemplars

    We propose a generative model of 2D and 3D natural textures that offers diversity, visual fidelity, and high computational efficiency. This is enabled by a family of methods that extend ideas from classic stochastic procedural texturing (Perlin noise) to learned, deep non-linearities. The key idea is a hard-coded, tunable, and differentiable step that feeds multiple transformed random 2D or 3D fields into an MLP that can be sampled over infinite domains. Our model encodes all exemplars from a diverse set of textures without needing to be re-trained for each exemplar. Applications include texture interpolation and learning 3D textures from 2D exemplars.
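    As a rough illustration of the key idea, the sketch below feeds multi-scale pseudo-noise features into an MLP that can be queried at arbitrary 3D coordinates; frozen random projections with a sinusoidal non-linearity stand in for the transformed noise fields, and all names and sizes are assumptions.

```python
# Minimal sketch: a stationary 3D texture as an MLP over multi-scale
# pseudo-noise features, sampled at arbitrary world coordinates.
import torch
import torch.nn as nn

class NoiseTextureMLP(nn.Module):
    def __init__(self, num_scales=4, hidden=128):
        super().__init__()
        # One frozen random projection per scale stands in for a
        # transformed random 3D field.
        self.projections = nn.ModuleList(
            [nn.Linear(3, 8) for _ in range(num_scales)])
        self.scales = [2.0 ** i for i in range(num_scales)]
        self.mlp = nn.Sequential(
            nn.Linear(8 * num_scales, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),  # RGB in [0, 1]
        )

    def forward(self, xyz):  # xyz: (N, 3) world coordinates, any extent
        feats = [torch.sin(p(xyz * s))
                 for p, s in zip(self.projections, self.scales)]
        return self.mlp(torch.cat(feats, dim=-1))

texture = NoiseTextureMLP()
rgb = texture(torch.rand(1024, 3))   # query any 3D points -> (1024, 3) colors
```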

    CamP: Camera Preconditioning for Neural Radiance Fields

    Neural Radiance Fields (NeRF) can be optimized to obtain high-fidelity 3D reconstructions of objects and large-scale scenes. However, NeRFs require accurate camera parameters as input: inaccurate camera parameters result in blurry renderings. Extrinsic and intrinsic camera parameters are usually estimated using Structure-from-Motion (SfM) methods as a pre-processing step to NeRF, but these techniques rarely yield perfect estimates. Thus, prior works have proposed jointly optimizing camera parameters alongside a NeRF, but these methods are prone to local minima in challenging settings. In this work, we analyze how different camera parameterizations affect this joint optimization problem and observe that standard parameterizations exhibit large differences in magnitude with respect to small perturbations, which can lead to an ill-conditioned optimization problem. We propose using a proxy problem to compute a whitening transform that eliminates the correlation between camera parameters and normalizes their effects, and we propose to use this transform as a preconditioner for the camera parameters during joint optimization. Our preconditioned camera optimization significantly improves reconstruction quality on scenes from the Mip-NeRF 360 dataset: we reduce error rates (RMSE) by 67% compared to state-of-the-art NeRF approaches that do not optimize for cameras, such as Zip-NeRF, and by 29% relative to state-of-the-art joint optimization approaches using the camera parameterization of SCNeRF. Our approach is easy to implement, does not significantly increase runtime, can be applied to a wide variety of camera parameterizations, and can straightforwardly be incorporated into other NeRF-like models. (SIGGRAPH Asia 2023; project page: https://camp-nerf.github.io)
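    The preconditioning idea can be made concrete with a small numerical sketch: differentiate a proxy problem with respect to the camera parameters, then whiten with the inverse square root of the resulting Gauss-Newton covariance. The toy parameterization, proxy, and constants below are assumptions for illustration, not the paper's formulation.

```python
# Sketch: whitening preconditioner for camera parameters from a proxy problem.
import numpy as np

def proxy_residuals(cam):
    """Hypothetical proxy: project a fixed point cloud with camera `cam`."""
    focal, tx, ty = cam
    pts = np.random.RandomState(0).randn(100, 3) + np.array([0, 0, 5.0])
    return focal * (pts[:, :2] + [tx, ty]) / pts[:, 2:3]   # pinhole projection

def whitening_preconditioner(cam, eps=1e-6):
    # Finite-difference Jacobian of the proxy w.r.t. the camera parameters.
    r0 = proxy_residuals(cam).ravel()
    J = np.stack([(proxy_residuals(cam + eps * e).ravel() - r0) / eps
                  for e in np.eye(len(cam))], axis=1)
    # P = (J^T J + damping)^(-1/2): equalizes how strongly each parameter
    # perturbs the proxy outputs and decorrelates their effects.
    cov = J.T @ J + 1e-8 * np.eye(len(cam))
    vals, vecs = np.linalg.eigh(cov)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

cam0 = np.array([500.0, 0.01, -0.02])   # [focal, tx, ty], toy parameterization
P = whitening_preconditioner(cam0)
# Optimize z instead of cam, mapping back via cam = cam0 + P @ z,
# so a unit step in any direction of z perturbs the proxy comparably.
```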

    D32.1: Individual Use Cases and Test Scenarios Definition

    ecoDriver targets a 20% reduction of CO2 emissions and fuel consumption in road transport by encouraging the adoption of green driving behaviour. Drivers will receive eco-driving recommendations and feedback adapted to them and to their vehicle characteristics. A range of driving profiles, powertrain

    Femtosecond Transfer and Manipulation of Persistent Hot-Trion Coherence in a Single CdSe/ZnSe Quantum Dot

    Ultrafast transmission changes around the fundamental trion resonance are studied after exciting a p-shell exciton in a negatively charged II-VI quantum dot. The biexcitonic induced absorption reveals quantum beats between hot trion states at 133 GHz. While interband dephasing is dominated by relaxation of the p-shell hole within 390 fs, trionic coherence remains stored in the spin system for 85 ps due to Pauli blocking of the triplet electron. The complex spectro-temporal evolution of the transmission is explained analytically by solving the Maxwell-Liouville equations. Pump and probe polarizations provide full control over the amplitude and phase of the quantum beats.
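    For orientation, the observed 133 GHz beat frequency maps to an energy splitting and beat period as follows (back-of-envelope values, not taken from the paper):

```latex
% Beat frequency -> hot-trion energy splitting and beat period (rounded).
\[
\Delta E = h\,\nu \approx 4.136\times10^{-15}\,\mathrm{eV\,s}
           \times 133\times10^{9}\,\mathrm{Hz} \approx 0.55\,\mathrm{meV},
\qquad
T_{\mathrm{beat}} = 1/\nu \approx 7.5\,\mathrm{ps}.
\]
```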

    Generative Modelling of BRDF Textures from Flash Images

    We learn a latent space for easy capture, semantic editing, consistent interpolation, and efficient reproduction of visual material appearance. When a user provides a photo of a stationary natural material captured under flash illumination, it is converted in milliseconds into a latent material code. In a second step, conditioned on the material code, our method produces, again in milliseconds, an infinite and diverse spatial field of BRDF model parameters (diffuse albedo, specular albedo, roughness, normals) that allows rendering in complex scenes and illuminations, matching the appearance of the input picture. Technically, we jointly embed all flash images into a latent space using a convolutional encoder and, conditioned on these latent codes, convert random spatial fields into fields of BRDF parameters using a convolutional neural network (CNN). We condition these BRDF parameters to match the visual characteristics (statistics and spectra of visual features) of the input under matching light. A user study confirms that the semantics of the latent material space agree with user expectations and compares our approach favorably to previous work.
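    The two-stage pipeline (flash image to latent code, then noise to BRDF fields conditioned on that code) can be sketched as below. Channel counts, layers, and the eight-channel BRDF packing (diffuse 3 + specular 1 + roughness 1 + normal 3) are illustrative assumptions.

```python
# Minimal sketch: flash photo -> material code -> per-pixel BRDF parameters.
import torch
import torch.nn as nn

class FlashEncoder(nn.Module):
    def __init__(self, latent=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, latent))
    def forward(self, flash):                 # (B, 3, H, W) flash photo
        return self.net(flash)                # (B, latent) material code

class NoiseToBRDF(nn.Module):
    """Decode noise into diffuse (3) + specular (1) + roughness (1) + normal (3)."""
    def __init__(self, latent=8):
        super().__init__()
        self.net = nn.Conv2d(1 + latent, 8, 3, padding=1)
    def forward(self, noise, code):           # noise: (B, 1, H, W)
        b, _, h, w = noise.shape
        # Tile the material code across the spatial field as conditioning.
        cond = code.view(b, -1, 1, 1).expand(b, code.shape[1], h, w)
        return self.net(torch.cat([noise, cond], dim=1))   # (B, 8, H, W)

flash = torch.rand(1, 3, 64, 64)
code = FlashEncoder()(flash)
brdf = NoiseToBRDF()(torch.randn(1, 1, 256, 256), code)  # any spatial extent
```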

    Unsupervised learning of 3D object categories from videos in the wild

    Our goal is to learn a deep network that, given a small number of images of an object of a given category, reconstructs it in 3D. While several recent works have obtained analogous results using synthetic data or assuming the availability of 2D primitives such as keypoints, we are interested in working with challenging real data and with no manual annotations. We thus focus on learning a model from multiple views of a large collection of object instances. We contribute a new large dataset of object-centric videos suitable for training and benchmarking this class of models. We show that existing techniques leveraging meshes, voxels, or implicit surfaces, which work well for reconstructing isolated objects, fail on this challenging data. Finally, we propose a new neural network design, called warp-conditioned ray embedding (WCR), which significantly improves reconstruction while obtaining a detailed implicit representation of the object surface and texture, also compensating for the noise in the initial SfM reconstruction that bootstrapped the learning process. Our evaluation demonstrates performance improvements over several deep monocular reconstruction baselines on existing benchmarks and on our novel dataset. For additional material please visit: https://henzler.github.io/publication/unsupervised_videos/
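    A simplified stand-in for the feature aggregation underlying WCR: project 3D samples along a target ray into each source view, sample per-view CNN features at those projections, and average across views. Shapes and the toy pinhole projection below are assumptions, not the paper's implementation.

```python
# Sketch: aggregate pixel-wise features from multiple source views
# for 3D points sampled along a target ray.
import torch
import torch.nn.functional as F

def project(points, pose, focal=1.0):
    """Toy pinhole projection of world points (N, 3) into a camera."""
    cam = (points - pose['t']) @ pose['R'].T              # world -> camera
    return focal * cam[:, :2] / cam[:, 2:3].clamp(min=1e-6)  # (N, 2)

def aggregate_features(points, feature_maps, poses):
    """Sample each source feature map at the projections, then average."""
    per_view = []
    for fmap, pose in zip(feature_maps, poses):           # fmap: (C, H, W)
        uv = project(points, pose).view(1, -1, 1, 2)      # grid in [-1, 1]
        feats = F.grid_sample(fmap[None], uv, align_corners=True)
        per_view.append(feats.view(fmap.shape[0], -1).T)  # (N, C)
    return torch.stack(per_view).mean(dim=0)              # (N, C) aggregated

points = torch.rand(128, 3)                               # ray samples
poses = [{'R': torch.eye(3), 't': torch.tensor([0., 0., -2.])}
         for _ in range(3)]
features = [torch.rand(16, 32, 32) for _ in range(3)]     # per-view features
embedding = aggregate_features(points, features, poses)   # (128, 16)
```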

    A quantitative CT parameter for the assessment of pulmonary oedema in patients with acute respiratory distress syndrome.

    Objectives: The aim of this study was to establish quantitative CT (qCT) parameters for pathophysiological understanding and clinical use in patients with acute respiratory distress syndrome (ARDS). The most promising parameter is introduced.
    Materials and methods: 28 intubated patients with ARDS underwent a conventional CT scan in end-expiratory breathhold within the first 48 hours after admission to the intensive care unit (ICU). Following manual segmentation, 137 volume- and lung-weight-associated qCT parameters were correlated with 71 clinical parameters such as blood gases, applied ventilation pressures, pulse contour cardiac output measurements, and established status and prognosis scores (SOFA, SAPS II).
    Results: Of all examined qCT parameters, excess lung weight (ELW), i.e. the difference between a patient's current lung weight and the virtual lung weight of a healthy person of the same height, displayed the most significant results. ELW correlated significantly with the amount of inflated lung tissue [%].
    Conclusions: ELW could serve as a non-invasive method to quantify the amount of pulmonary oedema. It might serve as an early radiological marker of severity in patients with ARDS.
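    The ELW definition reduces to a simple difference, spelled out below; the height regression used as the healthy reference is a placeholder assumption, not the study's model.

```python
# Sketch of the ELW definition: measured lung weight minus the predicted
# lung weight of a healthy person of the same height.
def excess_lung_weight(measured_weight_g, height_cm,
                       predict_normal=lambda h_cm: 11.0 * h_cm - 1000.0):
    """ELW [g] = measured lung weight [g] - predicted healthy weight [g]."""
    return measured_weight_g - predict_normal(height_cm)

print(excess_lung_weight(1800.0, 175.0))  # 1800 g measured -> 875 g excess
```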